AITopics

2510.04114

Country: North America > Canada (0.14)

Genre: Research Report (1.00)

Industry:

Education (0.66)
Law (0.46)
Banking & Finance > Real Estate (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Chen, Marian, Zilka, Miri

Learning Pareto-Optimal Pandemic Intervention Policies with MORL

arXiv.org Artificial Intelligence, Oct-7-2025

The COVID-19 pandemic underscored a critical need for intervention strategies that balance disease containment with socioeconomic stability. We approach this challenge by designing a framework for modeling and evaluating disease-spread prevention strategies. Our framework leverages multi-objective reinforcement learning (MORL) - a formulation necessitated by competing objectives - combined with a new stochastic differential equation (SDE) pandemic simulator, calibrated and validated against global COVID-19 data. Our simulator reproduces national-scale pandemic dynamics with orders of magnitude higher fidelity than other models commonly used in reinforcement learning (RL) approaches to pandemic intervention. Training a Pareto-Conditioned Network (PCN) agent on this simulator, we illustrate the direct policy trade-offs between epidemiological control and economic stability for COVID-19. Furthermore, we demonstrate the framework's generality by extending it to pathogens with different epidemiological profiles, such as polio and influenza, and show how these profiles lead the agent to discover fundamentally different intervention policies. To ground our work in contemporary policymaking challenges, we apply the model to measles outbreaks, quantifying how a modest 5% drop in vaccination coverage necessitates significantly more stringent and costly interventions to curb disease spread. This work provides a robust and adaptable framework to support transparent, evidence-based policymaking for mitigating public health crises.
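To make "Pareto-optimal" concrete, here is a minimal sketch of filtering a set of candidate policies down to the non-dominated ones. The two objectives (infections, economic cost), the sample numbers, and the function name `pareto_front` are illustrative assumptions, not taken from the paper, and this is not the PCN training procedure itself.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Every objective is treated as a cost to be minimised.
    """
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        no_worse = all(x <= y for x, y in zip(a, b))
        strictly_better = any(x < y for x, y in zip(a, b))
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (infections, economic cost) outcomes for five candidate policies.
policies = [(100, 5), (80, 9), (120, 4), (90, 9), (100, 7)]
front = pareto_front(policies)  # [(100, 5), (80, 9), (120, 4)]
```

Each remaining point represents a different trade-off: none can be improved on one objective without getting worse on the other.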

intervention, machine learning, reinforcement learning, (18 more...)

2510.0334

Country: Europe > United Kingdom (0.68)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)

Liautaud, Paul, Gaillard, Pierre, Wintenberger, Olivier

Minimax Adaptive Boosting for Online Nonparametric Regression

arXiv.org Machine Learning, Oct-4-2024

We study boosting for adversarial online nonparametric regression with general convex losses. We first introduce a parameter-free online gradient boosting (OGB) algorithm and show that its application to chaining trees achieves minimax optimal regret when competing against Lipschitz functions. While competing with nonparametric function classes can be challenging, the latter often exhibit local patterns, such as local Lipschitzness, that online algorithms can exploit to improve performance. By applying OGB over a core tree based on chaining trees, our proposed method effectively competes against all prunings that align with different Lipschitz profiles and demonstrates optimal dependence on the local regularities. As a result, we obtain the first computationally efficient algorithm with locally adaptive optimal rates for online regression in an adversarial setting.

algorithm, october 7, preprint, (15 more...)

2410.03363

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.84)

arXiv.org Artificial Intelligence, Oct-3-2024

Compute Or Load KV Cache? Why Not Both?

Jin, Shuowei, Liu, Xueshen, Zhang, Qingzhao, Mao, Z. Morley

Recent advancements in Large Language Models (LLMs) have significantly increased context window sizes, enabling sophisticated applications but also introducing substantial computational overheads, particularly for computing the key-value (KV) cache in the prefill stage. Prefix caching has emerged to save GPU power in this scenario: it saves the KV cache on disk and reuses it across multiple queries. However, traditional prefix caching mechanisms often suffer from substantial latency because the speed of loading the KV cache from disk to GPU memory is bottlenecked by the throughput of I/O devices. To optimize the latency of long-context prefill, we propose Cake, a novel KV cache loader, which employs a bidirectional parallelized KV cache generation strategy. Upon receiving a prefill task, Cake simultaneously and dynamically loads saved KV cache from prefix cache locations and computes KV cache on local GPUs, maximizing the utilization of available computation and I/O bandwidth resources. Additionally, Cake automatically adapts to diverse system statuses without manual parameter tuning. In experiments on various prompt datasets, GPUs, and I/O devices, Cake offers up to a 68.1% Time To First Token (TTFT) reduction compared with the compute-only method and a 94.6% TTFT reduction compared with the I/O-only method.
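The idea of computing and loading at the same time can be illustrated with a toy scheduling model. This is only a sketch under stated assumptions: the prompt is split into equal chunks, each chunk has a fixed compute time and a fixed load time, computation proceeds from the front while loading proceeds from the back, and the next chunk goes to whichever worker would finish it sooner. None of these details, nor the name `bidirectional_prefill`, come from the abstract.

```python
def bidirectional_prefill(n_chunks, compute_time, load_time):
    """Toy model: return (chunks_computed, chunks_loaded, finish_time)."""
    front, back = 0, n_chunks - 1   # next chunk index for each worker
    t_compute = t_load = 0.0        # time at which each worker becomes free
    computed = loaded = 0
    while front <= back:
        # Assign the next chunk to the worker that would finish it sooner.
        if t_compute + compute_time <= t_load + load_time:
            t_compute += compute_time
            front += 1
            computed += 1
        else:
            t_load += load_time
            back -= 1
            loaded += 1
    return computed, loaded, max(t_compute, t_load)


# Hypothetical setting: 8 chunks, computing takes 1 unit, loading takes 3 units.
computed, loaded, finish = bidirectional_prefill(8, 1, 3)
compute_only = 8 * 1   # baseline: compute every chunk
load_only = 8 * 3      # baseline: load every chunk
```

In this toy setting the combined schedule finishes before either baseline, because neither the GPU nor the I/O device sits idle.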

kv cache, large language model, machine learning, (18 more...)

2410.03065

Country:

North America > United States > Michigan (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Giudice, Gianluca, Geneletti, Sara, Kalogeropoulos, Konstantinos

Inference on Causal Effects of Interventions in Time using Gaussian Processes

arXiv.org Machine Learning, Oct-6-2022

Recently, many applications have been devoted to understanding and revealing causal rather than associative relations among variables. One approach in the context of time series is that of synthetic controls (Abadie and Gardeazabal, 2003) and various extensions. This is based on the idea of recovering the counterfactual outcome that would have been observed had an intervention not taken place. This article contributes to expanding and generalizing this class of models, allowing for non-linearity in a nonparametric manner through Gaussian Processes. These models have a high degree of flexibility in building the counterfactual outcome, using all types of information and without any limitations on the functional form. They also make it possible to assess the robustness of the synthetic controls, as we can use the posterior distributions of the Gaussian Processes to quantify uncertainty stemming from the functional form estimation. Lastly, as the models learn the relationships that prevail amongst all associated variables, there is no need to match the time series on a calendar basis, making the most of the available data.

artificial intelligence, machine learning, modeling & simulation, (21 more...)

2210.0285

Country:

Europe > France (0.04)
Europe > Portugal (0.04)
Asia > Middle East > Jordan (0.04)
(13 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Vaccines (0.70)
(2 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Bobadilla-Suarez, Sebastian, Jones, Matt, Love, Bradley C.

Robust priors for regularized regression

arXiv.org Machine Learning, Oct-6-2020

Correspondence should be addressed to sebastian.suarez.12@ucl.ac.uk. Penalized regression approaches, like ridge regression, shrink weights toward zero, but zero association is usually not a sensible prior. Inspired by simple and robust decision heuristics humans use, we constructed nonzero priors for penalized regression models that provide robust and interpretable solutions across several tasks. Our approach enables estimates from a constrained model to serve as a prior for a more general model, yielding a principled way to interpolate between models of differing complexity. We successfully applied this approach to a number of decision and classification problems, as well as analyzing simulated brain imaging data. Models with robust priors had excellent worst-case performance. Solutions followed from the form of the heuristic that was used to derive the prior. These new algorithms can serve applications in data analysis and machine learning, as well as help in understanding how people transition from novice to expert performance. Inference from data is most successful when it involves a helpful inductive bias or prior belief. Regularized regression approaches, such as ridge regression, incorporate a penalty term that complements the fit term by providing a constraint on the solution, akin to how Occam's razor favors solutions that both fit the observed data and are simple.
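One standard way to shrink toward a nonzero prior is to penalize the distance from a prior weight vector `w0` instead of the distance from zero. The sketch below solves that problem in closed form using only the standard library. The objective, the names `ridge_with_prior` and `solve`, and the toy data are assumptions for illustration; the abstract does not give the authors' exact formulation or how they derive the prior from a heuristic.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_with_prior(X, y, w0, lam):
    """Minimise ||y - X w||^2 + lam * ||w - w0||^2.

    Normal equations: (X^T X + lam I) w = X^T y + lam w0.
    """
    d = len(w0)
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) + lam * w0[i]
         for i in range(d)]
    return solve(A, b)


# Toy data generated by w = [2, 3]; the prior [1, 1] mimics an equal-weights heuristic.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
w_fit = ridge_with_prior(X, y, w0=[1.0, 1.0], lam=0.0)    # pure least squares
w_prior = ridge_with_prior(X, y, w0=[1.0, 1.0], lam=1e9)  # dominated by the prior
```

Varying `lam` between these extremes interpolates between the data-driven fit and the prior.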

artificial intelligence, machine learning, regression, (18 more...)

2010.0261

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > Wisconsin (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology (0.90)
Health & Medicine > Diagnostic Medicine > Imaging (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.91)

Kirchhoff, Dominik, Kuhnt, Sonja

Gaussian Process Models with Low-Rank Correlation Matrices for Both Continuous and Categorical Inputs

arXiv.org Machine Learning, Oct-6-2020

We introduce a method that uses low-rank approximations of cross-correlation matrices in mixed continuous and categorical Gaussian Process models. This new method -- called Low-Rank Correlation (LRC) -- offers the ability to flexibly adapt the number of parameters to the problem at hand by choosing an appropriate rank of the approximation. Furthermore, we present a systematic approach of defining test functions that can be used for assessing the accuracy of models or optimization methods that are concerned with both continuous and categorical inputs. We compare LRC to existing approaches of modeling the cross-correlation matrix. It turns out that the new approach performs well in terms of estimation of cross-correlations and response surface prediction. Therefore, LRC is a flexible and useful addition to existing methods, especially for increasing numbers of combinations of levels of the categorical inputs.

artificial intelligence, machine learning, modeling & simulation, (20 more...)

2010.02574

Country: Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

De Meulemeester, Hannes, Schreurs, Joachim, Fanuel, Michaël, De Moor, Bart, Suykens, Johan A. K.

The Bures Metric for Taming Mode Collapse in Generative Adversarial Networks

arXiv.org Machine Learning, Oct-6-2020

Generative Adversarial Networks (GANs) are performant generative methods yielding high-quality samples. However, under certain circumstances, the training of GANs can lead to mode collapse or mode dropping, i.e. the generative models not being able to sample from the entire probability distribution. To address this problem, we use the last layer of the discriminator as a feature map to study the distribution of the real and the fake data. During training, we propose to match the real batch diversity to the fake batch diversity by using the Bures distance between covariance matrices in feature space. The computation of the Bures distance can be conveniently done in either feature space or kernel space in terms of the covariance and kernel matrix respectively. We observe that diversity matching reduces mode collapse substantially and has a positive effect on the sample quality. On the practical side, a very simple training procedure, that does not require additional hyperparameter tuning, is proposed and assessed on several datasets.
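For reference, the squared Bures distance between two positive semidefinite matrices is usually written as below. Here $A$ and $B$ stand for the real-batch and fake-batch covariance matrices in feature space; the symbols are our own, and the abstract does not state any normalization the authors apply.

```latex
d_B(A, B)^2 = \operatorname{Tr}(A) + \operatorname{Tr}(B)
  - 2\,\operatorname{Tr}\!\left[\left(A^{1/2} B A^{1/2}\right)^{1/2}\right]
```

The distance is zero exactly when $A = B$, which is why it can serve as a penalty for mismatched batch diversity.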

architecture, artificial intelligence, machine learning, (17 more...)

2006.09096

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Barros, Gabriel Moraes, Colombini, Esther Luna

Using Soft Actor-Critic for Low-Level UAV Control

arXiv.org Artificial Intelligence, Oct-5-2020

Unmanned Aerial Vehicles (UAVs), or drones, have recently been used in several civil application domains, from organ delivery to remote locations to wireless network coverage. These platforms, however, are naturally unstable systems for which many different control approaches have been proposed. Generally based on classic and modern control, these algorithms require knowledge of the robot's dynamics. Recently, however, model-free reinforcement learning has been successfully used for controlling drones without any prior knowledge of the robot model. In this work, we present a framework to train the Soft Actor-Critic (SAC) algorithm for low-level control of a quadrotor in a go-to-target task. All experiments were conducted in simulation. With the experiments, we show that SAC can not only learn a robust policy but also cope with unseen scenarios. Videos from the simulations are available at https://www.youtube.com/watch?v=9z8vGs0Ri5g and the code at https://github.com/larocs/SAC_uav.

machine learning, reinforcement learning, trajectory, (15 more...)

2010.02293

Country:

South America > Brazil > São Paulo > Campinas (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.48)

arXiv.org Machine Learning, Oct-4-2019

PPGAN: Privacy-preserving Generative Adversarial Network

Liu, Yi, Peng, Jialiang, Yu, James J. Q, Wu, Yi

Generative Adversarial Network (GAN) and its variants serve as a perfect representation of the data generation model, providing researchers with a large amount of high-quality generated data. They illustrate a promising direction for research with limited data availability. When a GAN learns the semantic-rich data distribution from a dataset, the density of the generated distribution tends to concentrate on the training data. Because the gradient parameters of the deep neural network contain the data distribution of the training samples, they can easily remember the training samples. When a GAN is applied to private or sensitive data, for instance patient medical records, private information may be leaked. To address this issue, we propose a Privacy-preserving Generative Adversarial Network (PPGAN) model, in which we achieve differential privacy in GANs by adding well-designed noise to the gradient during the model learning procedure. Besides, we introduced the Moments Accountant strategy in the PPGAN training process to improve the stability and compatibility of the model by controlling privacy loss. We also give a mathematical proof of the differential privacy discriminator. Through extensive case studies of the benchmark datasets, we demonstrate that PPGAN can generate high-quality synthetic data while retaining the required data availability under a reasonable privacy budget.
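Adding noise to gradients is commonly done by first clipping the gradient norm and then adding Gaussian noise scaled to the clipping bound. The sketch below shows that generic recipe. It is an assumption that PPGAN follows this pattern; the function name `privatize_gradient`, its parameters, and the sample values are hypothetical, and privacy accounting (such as the Moments Accountant) is omitted.

```python
import math
import random


def privatize_gradient(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient to L2 norm `clip_norm`, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # Scale down only if the gradient exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy = privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can influence the update, which is what lets the added noise translate into a privacy guarantee.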

differential privacy, ppgan, privacy, (12 more...)

1910.02007

Country: Asia > China (0.05)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)